8.2 Molecular Simulation Methods
Monte Carlo techniques rely on repeated random sampling to determine numerically whether or not specific events occur in a given biological system. In the most common biological applications, Monte Carlo methods are used to enable stochastic simulations. For example, a Monte Carlo algorithm will consider the likelihood that a given biological event occurs in the time interval t to t + Δt, where Δt is a small but finite discretized incremental time unit (e.g., for combined classical MD and Monte Carlo methods, Δt ~ 10⁻¹⁵ s). The probability Δp for that event to occur is calculated from biophysical parameters specific to the biological system in question, and this is compared against a probability prand generated from a pseudorandom number generator in the range 0–1. If Δp > prand, then one assumes the event has occurred, and the algorithm changes the properties of the biological system to take account of this event. The new time in the stochastic simulation then becomes t′ = t + Δt, the polling process is repeated over the interval t′ to t′ + Δt, and the procedure is iterated over the full range of time to be explored.
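By way of illustration, the polling scheme just described can be sketched in a few lines of Python; the event rate k_event, the time step dt, and the total simulated time used here are purely hypothetical values, chosen so that Δp remains well below 1:

```python
import random

# Minimal sketch of stochastic event polling (illustrative parameter values only).
# k_event is a hypothetical first-order rate constant for the biological event (s^-1);
# dp = k_event * dt approximates the probability that the event occurs in t to t + dt.
k_event = 50.0      # hypothetical event rate (s^-1)
dt = 1e-3           # time step (s), chosen so that dp << 1
t_total = 1.0       # total simulated time (s)

t = 0.0
n_events = 0
while t < t_total:
    dp = k_event * dt              # probability of the event in this interval
    p_rand = random.random()       # pseudorandom number in the range 0-1
    if dp > p_rand:                # Monte Carlo acceptance test
        n_events += 1              # update the system to reflect the event
    t += dt                        # advance to t' = t + dt and poll again

print(f"{n_events} events in {t_total} s (expected ~{k_event * t_total:.0f})")
```

In a real application, the line incrementing n_events would instead update whatever properties of the biological system the event changes.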
In molecular simulation applications, the trial probability Δp is associated with trial moves of an atom’s position. Classical MD, in the absence of any stochastic fluctuations in the simulation, is intrinsically deterministic. Monte Carlo methods, on the other hand, are stochastic in nature, and so atomic positions evolve with a random element over time (Figure 8.2). Pseudorandom small displacements are made in the positions of atoms, one by one, and the resultant change ΔU in potential energy is calculated using empirical potential energy functions similar to those described in the previous section for classical MD (most commonly the Lennard-Jones potential). Whether the random atomic displacement is accepted or not is usually determined using the Metropolis criterion. Here, the trial probability Δp is established using the Boltzmann factor exp(−ΔU/kBT), and if Δp > prand, the pseudorandom probability, then the atomic displacement is accepted, the potential energy is updated, and the process is iterated for another atom. Since Monte Carlo methods require only potential energy calculations to compute atomic trajectories, as opposed to additional force calculations, there is a saving in computational efficiency.
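The Python sketch below illustrates this Metropolis scheme for a toy cluster of atoms interacting through the Lennard-Jones potential in reduced units (ε = σ = kB = 1); the particle number, temperature, maximum trial displacement, and starting lattice are illustrative assumptions rather than values from the text:

```python
import math
import random

# Minimal Metropolis Monte Carlo sketch in Lennard-Jones reduced units
# (epsilon = sigma = kB = 1). Particle count, temperature, maximum
# displacement and starting lattice are illustrative assumptions.

def lj_pair(r2):
    """Lennard-Jones pair energy as a function of squared separation."""
    inv_r6 = 1.0 / r2 ** 3
    return 4.0 * (inv_r6 ** 2 - inv_r6)

def particle_energy(i, pos):
    """Potential energy of atom i due to all other atoms."""
    xi, yi, zi = pos[i]
    u = 0.0
    for j, (xj, yj, zj) in enumerate(pos):
        if j != i:
            u += lj_pair((xi - xj) ** 2 + (yi - yj) ** 2 + (zi - zj) ** 2)
    return u

random.seed(0)
T, max_disp, n_steps = 1.0, 0.1, 5000
# start from a small 2 x 2 x 2 cube with near-optimal LJ spacing
pos = [[1.12 * (k & 1), 1.12 * ((k >> 1) & 1), 1.12 * ((k >> 2) & 1)] for k in range(8)]

accepted = 0
for step in range(n_steps):
    i = random.randrange(len(pos))                     # move one atom at a time
    old = pos[i][:]
    u_old = particle_energy(i, pos)
    pos[i] = [x + random.uniform(-max_disp, max_disp) for x in old]  # trial move
    dU = particle_energy(i, pos) - u_old
    # Metropolis criterion: accept if exp(-dU/kBT) exceeds a pseudorandom number
    if dU <= 0.0 or math.exp(-dU / T) > random.random():
        accepted += 1
    else:
        pos[i] = old                                   # reject: restore old position

print(f"acceptance ratio = {accepted / n_steps:.2f}")
```

Note that only the potential energy of the moved atom needs to be recomputed at each step; no forces are evaluated, which is the source of the computational saving mentioned above.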
This simple process can lead to very complex outputs if, as is generally the case, the biological system is also spatially discretized; that is, probabilistic polling is performed on spatially separated components of the system. This can capture several emergent properties. Depending on the specific biophysics of the system under study, the incremental time Δt can vary, but it has to be small enough to ensure subsaturation levels of Δp, that is, Δp < 1 (in practice, Δt is often tailored so that typical values of Δp lie in the range ~0.3–0.5). However, very small values of Δt lead to excessive computational times. Similarly, extracting the most biological insight ideally requires the finest spatial discretization of the system. These two factors combined can demand significant parallelization of the computation and, when applied to molecular simulations, often necessitate supercomputing, or high-performance computing (HPC), resources. However, there are still several biological processes beyond molecular simulations that can be usefully simulated using nothing more than a few lines of code in any standard engineering software language (e.g., C/C++, MATLAB®, Fortran, Python, LabVIEW) on a standard desktop PC.
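As one hedged example of such a "few lines of code" simulation on a spatially discretized system, the Python sketch below polls each member of a row of hypothetical two-state ion channels independently at every time step; the rate constants, channel number, and time step are invented purely for illustration:

```python
import random

# Probabilistic polling on a spatially discretized system: a 1D row of
# membrane ion channels that each switch stochastically between closed (0)
# and open (1). All parameter values are illustrative assumptions.
k_open, k_close = 20.0, 80.0      # opening/closing rates (s^-1)
dt = 1e-3                         # time step (s); keeps dp well below 1
n_channels, n_steps = 100, 2000

state = [0] * n_channels          # all channels start closed
for step in range(n_steps):
    for i in range(n_channels):   # poll each spatial component separately
        dp = (k_close if state[i] else k_open) * dt
        if dp > random.random():  # Monte Carlo event test
            state[i] ^= 1         # flip between closed and open
    # the open fraction fluctuates around k_open/(k_open + k_close) = 0.2

print(f"open fraction at end = {sum(state) / n_channels:.2f}")
```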
A practical issue with the Metropolis criterion is that some microstates during the finite duration of a simulation may be very undersampled or in fact not occupied at all. This is a particular danger for a system that comprises two or more intermediate states of stability for molecular conformations that are separated by relatively large free energy barriers. This implies that the standard Boltzmann factor exp(−ΔU/kBT) is in practice very small when ΔU corresponds to these free energy barriers. This can result in a simulation being apparently locked into, for example, one heavily oversampled state. Given enough time, of course, a simulation would explore all microstates, as predicted from the ergodic hypothesis (see Chapter 6); however, practical computational limitations may not always permit that.
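A short numerical illustration (assuming, purely for the sake of example, barrier heights of 1–20 kBT) makes the scale of the problem clear:

```python
import math

# Illustrative only: acceptance probability exp(-dU/kBT) for trial moves
# that must surmount a free energy barrier of dU (in units of kBT).
for barrier_kT in (1, 5, 10, 20):
    print(f"dU = {barrier_kT:>2} kBT  ->  exp(-dU/kBT) = {math.exp(-barrier_kT):.1e}")
# A 20 kBT barrier gives ~2e-9, so barrier-crossing moves are essentially
# never accepted within a practical number of Monte Carlo steps.
```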
To overcome this problem, a method called “umbrella sampling” can be employed. Here a
weighting factor w is used in front of the standard Boltzmann term: